perf: Use OpenCV over PIL for PNG encoding in `ImageRef.from_pil` by maxdswain · Pull Request #562 · docling-project/docling-core

maxdswain · 2026-03-22T20:43:09Z

Overview
The ImageRef.from_pil class method is used widely in docling's codebase. It is often called several times per page when parsing documents in the docling_parse.pdf_parser.PdfDocument._to_bitmap_resources_from_decoder method. From my profiling, I found that it took up ~45% of processing time when doing a DocumentConverter conversion with all AI models disabled. This led me to look into how its performance can be improved.

The function uses Pillow to encode the image to PNG, which is notoriously slow. So I swapped it out for OpenCV, improving the performance of this function by ~55% for this simple test case:

import timeit

setup = """
from PIL import Image as PILImage
from docling_core.types.doc import ImageRef
"""

stmt = """
fig_image = PILImage.new(mode="RGBA", size=(200, 400), color=(0, 0, 0))
image = ImageRef.from_pil(image=fig_image, dpi=72)
"""

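number = 1000
result = timeit.timeit(stmt=stmt, setup=setup, number=number)
print(f"{number} runs: {result:.4f}s")
# Total seconds divided by the run count, converted to milliseconds.
print(f"Per call:  {result / number * 1000:.4f}ms")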
# Before: 1000 runs: 3.9219s, Per call:  3.9219ms
# After: 1000 runs: 1.7429s, Per call:  1.7429ms

When using these changes in the main docling repo, conversion time with all AI models disabled dropped from 14.2 to 9.21 (~35%).
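As a quick check of the percentages quoted above, the relative reductions can be recomputed from the reported timings. This is only a sketch: `relative_reduction` is a hypothetical helper name, and the inputs are the numbers stated in this description.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the reduction from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100


# Micro-benchmark of ImageRef.from_pil (seconds per 1000 runs, from the comments above).
from_pil_gain = relative_reduction(3.9219, 1.7429)

# End-to-end conversion with all AI models disabled (figures as reported above).
conversion_gain = relative_reduction(14.2, 9.21)

print(f"from_pil: {from_pil_gain:.1f}% faster")
print(f"conversion: {conversion_gain:.1f}% faster")
```

Both values land close to the rounded ~55% and ~35% figures given in the description.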

One caveat is that I added an extra dependency, opencv-python-headless; however, it is already a dependency in the main docling repo's uv.lock.

Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>

github-actions · 2026-03-22T20:43:19Z

✅ DCO Check Passed

Thanks @maxdswain, all your commits are properly signed off. 🎉

mergify · 2026-03-22T20:43:46Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

cau-git · 2026-03-23T15:23:56Z

@maxdswain it would definitely be welcome to have this performance bottleneck addressed. However, opencv-python adds some intricacies, since it exists in two flavours, opencv-python and opencv-python-headless, which must not be co-installed in the same environment, although nothing prevents that from the package manager's perspective.

To support this cleanly, and knowing that other optional third-party dependencies in docling, such as OCR engines, are split between the two flavours, we would have to:

Make sure there is a fallback path with pillow if neither flavour of opencv-python is installed (one could put a check and assign a module-global variable such as CV2_INSTALLED).
Define opencv as optional dependency, as we already do in docling-ibm-models here.
Declare correctly the two flavours as conflicting, as seen here.

Even so, it would not yet be a complete solution, since every dependent of docling-core would have to explicitly choose one of the extras to make use of the acceleration.

Would you like to take these adjustments into account?
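The fallback path suggested in the comment above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from this PR: `encode_png` is a hypothetical helper name, the `CV2_INSTALLED` flag follows the name proposed in the review, and only RGB and RGBA images are routed through OpenCV, with every other mode left to Pillow.

```python
import importlib.util
import io

# Module-global flag, as proposed in the review. Both opencv-python and
# opencv-python-headless expose the same importable module name, `cv2`.
CV2_INSTALLED = importlib.util.find_spec("cv2") is not None


def encode_png(image) -> bytes:
    """Encode a PIL image to PNG bytes, preferring OpenCV when it is available.

    Hypothetical helper for illustration only.
    """
    if CV2_INSTALLED and image.mode in ("RGB", "RGBA"):
        # Imported lazily so the module still loads without OpenCV or NumPy.
        import cv2
        import numpy as np

        arr = np.asarray(image)
        # Pillow stores channels as RGB(A); OpenCV expects BGR(A).
        code = cv2.COLOR_RGBA2BGRA if image.mode == "RGBA" else cv2.COLOR_RGB2BGR
        ok, buf = cv2.imencode(".png", cv2.cvtColor(arr, code))
        if ok:
            return buf.tobytes()

    # Fallback path: plain Pillow encoding.
    out = io.BytesIO()
    image.save(out, format="PNG")
    return out.getvalue()
```

Checking for the module with `importlib.util.find_spec` avoids importing OpenCV at module load time, so environments without either flavour only pay for the Pillow path.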

perf: Use OpenCV over PIL for PNG encoding

9cf8b6e

Signed-off-by: Max Swain <89113255+maxdswain@users.noreply.github.com>

cau-git self-requested a review March 30, 2026 07:48

maxdswain wants to merge 1 commit into docling-project:main from maxdswain:perf-imageref-from-pil
